223 research outputs found

    Twenty Years of Student Scholarship: Celebrating the Dalhousie Journal of Legal Studies

    Get PDF
    In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions. To achieve this, we introduce a word-level spatial and channel-wise attention-driven generator that can disentangle different visual attributes, and allow the model to focus on generating and manipulating subregions corresponding to the most relevant words. Also, a word-level discriminator is proposed to provide fine-grained supervisory feedback by correlating words with image regions, facilitating training an effective generator which is able to manipulate specific visual attributes without affecting the generation of other content. Furthermore, perceptual loss is adopted to reduce the randomness involved in the image generation, and to encourage the generator to manipulate specific attributes required in the modified text. Extensive experiments on benchmark datasets demonstrate that our method outperforms existing state of the art, and is able to effectively manipulate synthetic images using natural language descriptions. Code is available at https://github.com/mrlibw/ControlGAN.Comment: NeurIPS 201

    EC Agricultural Prices. Price Indices and absolute prices-Quarterly Statistics 1-1993

    Get PDF
    We propose MAD-GAN, an intuitive generalization to the Generative Adversarial Networks (GANs) and its conditional variants to address the well known problem of mode collapse. First, MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator. Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is designed such that along with finding the real and fake samples, it is also required to identify the generator that generated the given fake sample. Intuitively, to succeed in this task, the discriminator must learn to push different generators towards different identifiable modes. We perform extensive experiments on synthetic and real datasets and compare MAD-GAN with different variants of GAN. We show high quality diverse sample generations for challenging tasks such as image-to-image translation and face generation. In addition, we also show that MAD-GAN is able to disentangle different modalities when trained using highly challenging diverse-class dataset (e.g. dataset with images of forests, icebergs, and bedrooms). In the end, we show its efficacy on the unsupervised feature representation task

    Diagnosing and Preventing Instabilities in Recurrent Video Processing.

    Get PDF
    Recurrent models are a popular choice for video enhancement tasks such as video denoising or super-resolution. In this work, we focus on their stability as dynamical systems and show that they tend to fail catastrophically at inference time on long video sequences. To address this issue, we (1) introduce a diagnostic tool which produces input sequences optimized to trigger instabilities and that can be interpreted as visualizations of temporal receptive fields, and (2) propose two approaches to enforce the stability of a model during training: constraining the spectral norm or constraining the stable rank of its convolutional layers. We then introduce Stable Rank Normalization for Convolutional layers (SRN-C), a new algorithm that enforces these constraints. Our experimental results suggest that SRN-C successfully enforces stablility in recurrent video processing models without a significant performance loss

    Automatic cattle identification using graph matching based on local invariant features

    Get PDF
    Cattle muzzle classification can be considered as a biometric identifier important to animal traceability systems to ensure the integrity of the food chain. This paper presents a muzzle-based classification system that combines local invariant features with graph matching. The proposed approach consists of three phases; namely feature extraction, graph matching, and matching refinement. The experimental results showed that our approach is superior than existing works as ours achieves an all correct identification for the tested images. In addition, the results proved that our proposed method achieved this high accuracy even if the testing images are rotated in various angles.info:eu-repo/semantics/publishedVersio

    lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers

    Full text link
    We assume data sampled from a mixture of d-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with spherically symmetric distribution within the ambient space (for simplicity we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the lp-averaged distances of data points from d-dimensional subspaces, where p>0. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then for any fraction of outliers the most significant subspace can be recovered by lp minimization with overwhelming probability (which depends on the generating distribution and its parameters). We show that when adding small noise around the underlying subspaces the most significant subspace can be nearly recovered by lp minimization for any 0<p<=1 with an error proportional to the noise level. On the other hand, if p>1 and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers.Comment: This is a revised version of the part of 1002.1994 that deals with single subspace recovery. V3: Improved estimates (in particular for Lemma 3.1 and for estimates relying on it), asymptotic dependence of probabilities and constants on D and d and further clarifications; for simplicity it assumes uniform distributions on spheres. V4: minor revision for the published versio

    Relative Pose from Deep Learned Depth and a Single Affine Correspondence

    Get PDF
    We propose a new approach for combining deep-learned non-metric monocular depth with affine correspondences (ACs) to estimate the relative pose of two calibrated cameras from a single correspondence. Considering the depth information and affine features, two new constraints on the camera pose are derived. The proposed solver is usable within 1-point RANSAC approaches. Thus, the processing time of the robust estimation is linear in the number of correspondences and, therefore, orders of magnitude faster than by using traditional approaches. The proposed 1AC+D solver is tested both on synthetic data and on 110395 publicly available real image pairs where we used an off-the-shelf monocular depth network to provide up-to-scale depth per pixel. The proposed 1AC+D leads to similar accuracy as traditional approaches while being significantly faster. When solving large-scale problems, e.g., pose-graph initialization for Structure-from-Motion (SfM) pipelines, the overhead of obtaining ACs and monocular depth is negligible compared to the speed-up gained in the pairwise geometric verification, i.e., relative pose estimation. This is demonstrated on scenes from the 1DSfM dataset using a state-of-the-art global SfM algorithm. Source code: https://github.com/eivan/one-ac-pos

    Bottom-up Instance Segmentation using Deep Higher-Order CRFs

    No full text
    Traditional Scene Understanding problems such as Object Detection and Semantic Segmentation have made breakthroughs in recent years due to the adoption of deep learning. However, the former task is not able to localise objects at a pixel level, and the latter task has no notion of different instances of objects of the same class. We focus on the task of Instance Segmentation which recognises and localises objects down to a pixel level. Our model is based on a deep neural network trained for semantic segmentation. This network incorporates a Conditional Random Field with end-to-end trainable higher order potentials based on object detector outputs. This allows us to reason about instances from an initial, category-level semantic segmentation. Our simple method effectively leverages the great progress recently made in semantic segmentation and object detection. The accurate instance-level segmentations that our network produces is reflected by the considerable improvements obtained over previous work at high APr IoU thresholds

    Pixelwise instance segmentation with a dynamically instantiated network

    No full text
    Semantic segmentation and object detection research have recently achieved rapid progress. However, the former task has no notion of different instances of the same object, and the latter operates at a coarse, bounding-box level. We propose an Instance Segmentation system that produces a segmentation map where each pixel is assigned an object class and instance identity label. Most approaches adapt object detectors to produce segments instead of boxes. In contrast, our method is based on an initial semantic segmentation module, which feeds into an instance subnetwork. This subnetwork uses the initial category-level segmentation, along with cues from the output of an object detector, within an end-to-end CRF to predict instances. This part of our model is dynamically instantiated to produce a variable number of instances per image. Our end-to-end approach requires no post-processing and considers the image holistically, instead of processing independent proposals. Therefore, unlike some related work, a pixel cannot belong to multiple instances. Furthermore, far more precise segmentations are achieved, as shown by our substantial improvements at high APr thresholds

    Bottom-up Instance Segmentation using Deep Higher-Order CRFs

    No full text
    Traditional Scene Understanding problems such as Object Detection and Semantic Segmentation have made breakthroughs in recent years due to the adoption of deep learning. However, the former task is not able to localise objects at a pixel level, and the latter task has no notion of different instances of objects of the same class. We focus on the task of Instance Segmentation which recognises and localises objects down to a pixel level. Our model is based on a deep neural network trained for semantic segmentation. This network incorporates a Conditional Random Field with end-to-end trainable higher order potentials based on object detector outputs. This allows us to reason about instances from an initial, category-level semantic segmentation. Our simple method effectively leverages the great progress recently made in semantic segmentation and object detection. The accurate instance-level segmentations that our network produces is reflected by the considerable improvements obtained over previous work at high APr IoU thresholds
    corecore